Characterizing and predicting agents via multi-agent evolution

ABSTRACT

A method of predicting the behavior of software agents in a simulated environment involves modeling a plurality of software agents representing entities to be analyzed, which may be human beings. Using a set of parameters that governs the behavior of the agents, the internal state of at least one of the agents is estimated by its behavior in the simulation, including its movement within the environment. This facilitates a prediction of the likely future behavior of the agent based solely upon its internal state; that is, without recourse to any intentional agent communications. In one embodiment, the simulated environment is based upon a digital pheromone infrastructure. The simulation integrates knowledge of threat regions, a cognitive analysis of the agent's beliefs, desires, and intentions, a model of the agent's emotional disposition and state, and the dynamics of interactions with the environment.

This application is a continuation of and claims priority to U.S. application Ser. No. 11/548,909, which claims benefit of and priority to U.S. Provisional Application No. 60/725,854, filed Oct. 12, 2005, and is entitled to that filing date for priority. The specification, figures and complete disclosures of U.S. Provisional Application No. 60/725,854 and application Ser. No. 11/548,909 are incorporated herein by specific reference for all purposes.

This application is based in part upon work supported by the Defense Advanced Research Projects Agency (DARPA) under Contract No. NBCHC040153. Any opinions, findings and conclusions or recommendations expressed in this material are those of the inventors and do not necessarily reflect the views of DARPA or the Department of Interior-National Business Center (DOI-NBC). Distribution Statement “A” (Approved for Public Release, Distribution Unlimited).

FIELD OF INVENTION

This invention relates generally to agent behavior and, in particular, to a system and method that characterizes an agent's internal state by evolution against observed behavior, and predicts future behavior, taking into account the dynamics of agents' interaction with their environment.

BACKGROUND OF THE INVENTION

Reasoning about agents that we observe in the world must integrate two disparate levels. Our observations are often limited to the agent's external behavior, which can frequently be summarized numerically as a trajectory in space-time (perhaps punctuated by actions from a fairly limited vocabulary). However, this behavior is driven by the agent's internal state, which (in the case of a human) may involve high-level psychological and cognitive concepts such as intentions and emotions. A central challenge in many application domains is reasoning from external observations of agent behavior to an estimate of their internal state. Such reasoning is motivated by a desire to predict the agent's behavior. Work to date focuses almost entirely on recognizing the rational state (as opposed to the emotional state) of a single agent (as opposed to an interacting community), and frequently takes advantage of explicit communications between agents (as in managing conversational protocols).

It is increasingly common in agent theory to describe the cognitive state of an agent in terms of its beliefs, desires, and intentions (the so-called “BDI” model [4, 15]). An agent's beliefs are propositions about the state of the world that it considers true, based on its perceptions. Its desires are propositions about the world that it would like to be true. Desires are not necessarily consistent with one another: an agent might desire both to be rich and not to work at the same time. An agent's intentions, or goals, are a subset of its desires that it has selected, based on its beliefs, to guide its future actions. Unlike desires, goals must be consistent with one another (or at least believed to be consistent by the agent).

An agent's goals guide its actions. Thus one ought to be able to learn something about an agent's goals by observing its past actions, and knowledge of the agent's goals in turn enables conclusions about what the agent may do in the future.

There is a considerable body of work in the AI and multi-agent community on reasoning from an agent's actions to the goals that motivate them. This process is known as “plan recognition” or “plan inference.” A recent survey is available at [2]. This body of work is rich and varied. It covers both single-agent and multi-agent (e.g., robot soccer team) plans, intentional vs. non-intentional actions, speech vs. non-speech behavior, adversarial vs. cooperative intent, complete vs. incomplete world knowledge, and correct vs. faulty plans, among other dimensions.

Plan recognition is seldom pursued for its own sake. It usually supports a higher-level function. For example, in human-computer interfaces, recognizing a user's plan can enable the system to provide more appropriate information and options for user action. In a tutoring system, inferring the student's plan is a first step to identifying buggy plans and providing appropriate remediation. In many cases, the higher-level function is predicting likely future actions by the entity whose plan is being inferred.

Many realistic problems deviate from these conditions:

-   Increasing the number of agents leads to a combinatorial explosion of possibilities that can swamp conventional analysis.
-   The dynamics of the environment can frustrate the intentions of an agent.
-   The agents often are trying to hide their intentions (and even their presence), rather than intentionally sharing information.
-   An agent's emotional state may be at least as important as its rational state in determining its behavior.

Domains that exhibit these constraints can often be characterized as adversarial, and include military combat, competitive business tactics, and multi-player computer games.

SUMMARY OF INVENTION

In various embodiments, the present invention comprises a method of predicting the behavior of software agents in a simulated environment. The method involves modeling a plurality of software agents representing entities to be analyzed, which may be human beings. Using a set of parameters that governs the behavior of the agents, the internal state of at least one of the agents is estimated by its behavior in the simulation, including its movement within the environment. This facilitates a prediction of the likely future behavior of the agent based solely upon its internal state; that is, without recourse to any intentional agent communications.

In one embodiment, the simulated environment is based upon a digital pheromone infrastructure. The digital pheromones are scalar variables that agents can sense and which they deposit at their current location in the environment. The agents respond to the local concentrations of the digital pheromones tropistically through climbing or descending local gradients. The pheromone infrastructure runs on the nodes of a graph-structured environment, preferably a rectangular lattice. Each agent is capable of aggregating pheromone deposits from individual agents, thereby fusing information across multiple agents over time. Each agent is further capable of evaporating pheromones over time to remove inconsistencies that result from changes in the simulation, and diffusing pheromones to nearby places, thereby disseminating information for access by nearby agents.

By reasoning from an entity's observed behavior, this invention is capable of providing an estimate of the entity's internal state, and extrapolating that estimate into a prediction of the entity's likely future behavior. The system and method, called BEE (Behavioral Evolution and Extrapolation), performs these and other tasks using a faster-than-real-time simulation of lightweight swarming agents, coordinated through digital pheromones. This simulation integrates knowledge of threat regions, a cognitive analysis of the agent's beliefs, desires, and intentions, a model of the agent's emotional disposition and state, and the dynamics of interactions with the environment. By evolving agents in this rich environment, their internal state can be fitted to their observed behavior. In realistic wargame scenarios, the system successfully detects deliberately played emotions and makes reasonable predictions about the entities' future behavior.

DESCRIPTION OF THE DRAWINGS

FIG. 1 is a graphic model of a tracking nonlinear dynamical system wherein a=system state space; b=system trajectory over time; c=recent measurements of system state; and d=short-range prediction.

FIG. 2 is a diagram of a Behavioral Evolution and Extrapolation (BEE) Integrated Rational and Emotive Personality Model.

FIG. 3 is a graphical representation of an exemplary embodiment of the BEE model, wherein each avatar generates a stream of ghosts that sample the personality space of the entity it represents. They evolve against the observed behavior of the entity in the recent past, and the fittest ghosts then run into the future to generate predictions.

FIG. 4 is a Delta Disposition chart for a “Chicken's Ghosts” embodiment.

FIG. 5 is a Delta Disposition chart for a “Rambo” embodiment.

FIG. 6 shows a table for evaluating predictions, where each row corresponds to a successive prediction for a given unit, and each column to a time in the real world that is covered by some set of these predictions. The shaded cells show which predictions cover which time periods. Each cell (a) contains the location error, that is, how far the unit is at the time indicated by the column from where the prediction indicated by the row said it would be. One can average these errors across a single prediction (b) to estimate the prospective accuracy of a single prediction, across a single time (c) to estimate the retrospective accuracy of all previous predictions referring to a given time, or across a given offset from the start of the prediction (d) to estimate the horizon error, i.e., how prediction accuracy varies with look-ahead depth.

FIG. 7 shows a graphic representation of path characteristics: angle θ, straight-line radius ρ, and actual length λ.

FIG. 8 shows graphs for exemplary stepwise metrics, including, from left to right, average prospective, retrospective, and horizon error. The thin line is the average of metrics from 100 random walks. The vertical line indicates when the unit dies. Since these are error curves, lower is better.

FIG. 9 shows graphs for exemplary component metrics. The thin line is the random baseline. Since these metrics indicate degree of agreement between prediction and baseline, higher is better.

DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS

In one exemplary embodiment, the present system provides a Behavioral Evolution and Extrapolation (BEE) method and approach to addressing the recognition of the rational and emotional state of multiple interacting agents based solely on their behavior, without recourse to intentional communications from them. It is inspired by techniques used to predict the behavior of nonlinear dynamical systems, in which a representation of the system is continually fit to its recent past behavior. In such analysis of nonlinear dynamical systems, the representation takes the form of a closed-form mathematical equation. In BEE, it takes the form of a set of parameters governing the behavior of software agents representing the individuals being analyzed.

In contrast to previous research in AI (plan recognition) and nonlinear dynamical systems (trajectory prediction), embodiments of the present invention focus on plan recognition in support of prediction. An agent's plan is a necessary input to a prediction of its future behavior, but hardly a sufficient one. At least two other influences, one internal and one external, need to be taken into account.

The external influence is the dynamics of the environment, which may include other agents. The dynamics of the real world impose significant constraints. The environment is autonomous (it may do things on its own that interfere with the desires of the agent) [3, 8]. Most interactions among agents, and between agents and the world, are nonlinear. When iterated, these can generate rapid divergence of trajectories (“chaos,” sensitivity to initial conditions).

A rational analysis of an agent's goals may enable one to predict what it will attempt, but any nontrivial plan with several steps will depend sensitively at each step on the reaction of the environment, and predictions must take this into account as well. Actual simulation of futures is one way to deal with these.

In the case of human agents, an internal influence also comes into play. The agent's emotional state can modulate its decision process and its focus of attention (and thus its perception of the environment). In extreme cases, emotion can lead an agent to choose actions that from the standpoint of a logical analysis may appear irrational.

Current work on plan recognition for prediction focuses on the rational plan, and does not take into account either external environmental influences or internal emotional biases. BEE integrates all three elements into its predictions.

Real-Time Fitting in Nonlinear Systems Analysis

Many systems of interest can be described in terms of a vector of real numbers that changes as a function of time. The dimensions of the vector define the system's state space. Notionally, one typically analyzes such systems as vector differential equations, e.g., dx/dt=ƒ(x).

When ƒ is nonlinear, the system can be formally chaotic, and starting points arbitrarily close to one another can lead to trajectories that diverge exponentially rapidly, becoming uncorrelated. Long-range prediction of the behavior of such a system is impossible in principle. However, it is often useful to anticipate the system's behavior a short distance into the future. To do so, a common technique is to fit a convenient functional form for ƒ to the system's trajectory in the recent past, and then extrapolate this fit into the future, as seen in FIG. 1 [6]. This process is repeated constantly, in real time, providing the user with a limited look-ahead into the system's future.

While this approach is robust and widely applied, it requires systems that can efficiently be described in terms of mathematical equations that can be fit using optimization methods such as least squares. BEE applies this approach to agent behaviors, which it fits to observed behavior using a genetic algorithm.
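The fit-and-extrapolate technique for equation-based systems can be illustrated with a minimal sketch. It assumes, purely for illustration, that the "convenient functional form" is a straight line fit by ordinary least squares to a sliding window of recent samples; the function names `fit_line` and `predict_ahead` and the window and horizon parameters are hypothetical and do not come from the specification.

```python
def fit_line(ts, xs):
    """Ordinary least-squares fit of x ~ a + b*t; returns (a, b)."""
    n = len(ts)
    t_mean = sum(ts) / n
    x_mean = sum(xs) / n
    var = sum((t - t_mean) ** 2 for t in ts)
    cov = sum((t - t_mean) * (x - x_mean) for t, x in zip(ts, xs))
    b = cov / var
    a = x_mean - b * t_mean
    return a, b


def predict_ahead(ts, xs, window, horizon):
    """Fit only the most recent `window` samples, then extrapolate
    `horizon` time units past the last measurement."""
    a, b = fit_line(ts[-window:], xs[-window:])
    return a + b * (ts[-1] + horizon)


# Repeating this call as each new measurement arrives gives a
# continually refreshed short-range look-ahead.
ts = [0.0, 1.0, 2.0, 3.0, 4.0]
xs = [1.0, 3.0, 5.0, 7.0, 9.0]
next_x = predict_ahead(ts, xs, window=3, horizon=1.0)
```

BEE keeps the same fit-then-extrapolate loop but replaces the closed-form equation with agent behavioral parameters, so the least-squares step here is an analogy rather than part of BEE itself.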

Architecture

BEE predicts the future by observing the emergent behavior of agents representing the entities of interest in a fine-grained agent simulation. Key elements of the BEE architecture include the model of an individual agent, the pheromone infrastructure through which agents interact, the information sources that guide them, and the overall evolutionary cycle that they execute.

Agent Model

The agents in BEE are inspired by two bodies of work. The first is our own previous work on fine-grained agents that coordinate their actions stigmergically, through digital pheromones in a shared environment [1, 11, 13, 14, 16]. The second inspiration is the success of previous agent-based combat modeling in EINSTein and MANA.

Digital pheromones are scalar variables that agents deposit at their current location in the environment, and that they can sense. Agents respond to the local concentrations of these variables tropistically, typically climbing or descending local gradients. Their movements in turn change the deposit patterns. This feedback loop, together with processes of evaporation and propagation in the environment, can support complex patterns of interaction and coordination among the agents [12]. Table 1 shows the pheromone flavors currently used in the BEE. In addition, ghosts take into account their distance from distinguished static locations, a mechanism that we call “virtual pheromones,” since it has the same effect as propagating a pheromone field from such a location, but with lower computational costs.

TABLE 1 PHEROMONE FLAVORS IN RAID

Pheromone flavor            Description
RedAlive, RedCasualty,      Emitted by a living or dead entity of the
BlueAlive, BlueCasualty,    appropriate group (Red = enemy,
GreenAlive, GreenCasualty   Blue = friendly, Green = neutral)
WeaponsFire                 Emitted by a firing weapon
KeySite                     Emitted by a site of particular importance to Red
Cover                       Emitted by locations that afford cover from fire
Mobility                    Emitted by roads and other structures that
                            enhance agent mobility
RedThreat, BlueThreat       Determined by external process

The use of agents to model combat is inspired by EINSTein and MANA. EINSTein [5] represents an agent as a set of six weights, each in [−1, 1], describing the agent's response to six kinds of information. Four of these describe the number of alive friendly, alive enemy, injured friendly, and injured enemy troops within the agent's sensor range. The other two weights relate to the model's use of a childhood game, “capture the flag,” as a prototype of combat. Each team has a flag, and seeks to protect it from the other team while capturing the other team's flag. The fifth and sixth weights describe how far the agent is from its own and its adversary's flag. A positive weight indicates that the agent is attracted to the entity described by the weight, while a negative weight indicates that it is repelled.

MANA [7] extends the concepts in EINSTein. Friendly and enemy flags are replaced by the waypoints being pursued by each side. MANA includes four additional components: low, medium, and high threat enemies. In addition, it defines a set of triggers (e.g., reaching a waypoint, being shot at, making contact with the enemy, being injured) that shift the agent from one personality vector to another. A default state defines the personality vector when no trigger state is active.

The personality vectors in MANA and EINSTein reflect both rational and emotive aspects of decision-making. The notion of being attracted or repelled by friendly or adversarial forces in various states of health is an important component of what we informally think of as emotion (e.g., fear, compassion, aggression), and the use of the term “personality” in both EINSTein and MANA suggests that the system designers are thinking anthropomorphically, though they do not use “emotion” to describe the effect they are trying to achieve. The notion of waypoints to which an agent is attracted reflects goal-oriented rationality.

BEE embodies an integrated rational-emotive personality model. In one embodiment, a BEE agent's rationality is modeled as a vector of seven desires, which are values in [−1, +1]: ProtectRed (the adversary), ProtectBlue (friendly forces), ProtectGreen (civilians), ProtectKeySites, AvoidCombat, AvoidDetection, and Survive. Negative values reverse the sense suggested by the label. For example, a negative value of ProtectRed indicates a desire to harm Red.

Table 2 shows which pheromones A(ttract) or R(epel) an agent with a given desire, and how that tendency translates into action.

The emotive component of a BEE's personality is based on the Ortony-Clore-Collins (OCC) framework [9], and described in detail elsewhere [10]. OCC define emotions as “valenced reactions to agents, states, or events in the environment.” This notion of reaction is captured in MANA's trigger states. An important advance in BEE's emotional model with respect to MANA and EINSTein is the recognition that agents may differ in how sensitive they are to triggers. For example, threatening situations tend to stimulate the emotion of fear, but a given level of threat will produce more fear in a new recruit than in a seasoned combat veteran. Thus, the present model includes not only Emotions, but Dispositions. Each Emotion has a corresponding Disposition. Dispositions are relatively stable, and considered constant over the time horizon of a run of the BEE, while Emotions vary based on the agent's disposition and the stimuli to which it is exposed.

Based on interviews with military domain experts we identified the two most crucial emotions for combat behavior as Anger (with the corresponding disposition Irritability) and Fear (whose disposition is Cowardice). Table 3 shows which pheromones trigger which emotions. Emotions are modeled as agent hormones (internal pheromones) that are augmented in the presence of the triggering environmental condition and evaporate over time.

TABLE 3 INTERACTIONS OF PHEROMONES AND DISPOSITIONS/EMOTIONS

Columns: Red Perspective (Irritability/Anger, Cowardice/Fear); Blue Perspective (Irritability/Anger, Cowardice/Fear); Green Perspective (Irritability/Anger, Cowardice/Fear)

Pheromone
RedAlive X X
RedCasualty X X
BlueAlive X X X X
BlueCasualty X X
GreenCasualty X X X X
WeaponsFire X X X X X X
KeySites X X

The effect of a non-zero emotion is to modify actions. An elevated level of Anger will increase movement likelihood, weapon firing likelihood, and tendency toward an exposed posture. An increasing level of Fear will decrease these likelihoods.
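The hormone-style emotion dynamics can be sketched as follows. This is an illustrative model only: the update rule (evaporate, then add a disposition-scaled stimulus), the clamping to [0, 1], and the parameters `evaporation` and `sensitivity` are assumptions introduced here, since the specification does not give numerical forms.

```python
def clamp01(value):
    """Keep a level or likelihood inside [0, 1]."""
    return min(1.0, max(0.0, value))


def update_emotion(level, stimulus, disposition, evaporation=0.1):
    """One time step of an emotion 'hormone': the old level evaporates,
    then the triggering stimulus is added, scaled by the disposition."""
    return clamp01((1.0 - evaporation) * level + disposition * stimulus)


def movement_likelihood(base, anger, fear, sensitivity=0.5):
    """Anger raises and Fear lowers the likelihood of moving
    (an assumed linear modulation)."""
    return clamp01(base * (1.0 + sensitivity * (anger - fear)))


# Same threat level, different Cowardice dispositions.
recruit_fear = update_emotion(0.0, stimulus=0.5, disposition=0.8)
veteran_fear = update_emotion(0.0, stimulus=0.5, disposition=0.2)
```

With these assumed values the recruit ends the step with more Fear than the veteran, and once the stimulus disappears the level decays by the evaporation fraction on each step.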

FIG. 2 summarizes one embodiment of the BEE's personality model. The left two columns are a straightforward BDI model (where we prefer the term “goal” to “intention”). The right-hand column is the emotive component, where an appraisal of the agent's beliefs, moderated by the disposition, leads to an emotion that in turn influences the BDI analysis.

The BEE Cycle

A major innovation in BEE is an extension of the nonlinear systems technique described herein to characterize agents based on their past behavior and extrapolate their future behavior based on this characterization. This section describes this process at a high level, then discusses in more detail the multi-page pheromone infrastructure that implements it.

Overview

FIG. 3 is an overview of one embodiment of the BEE process. Each active entity in the battlespace has an avatar that continuously generates a stream of ghost agents representing itself. Ghosts live on a timeline indexed by τ that begins in the past at the insertion horizon and runs into the future to the prediction horizon. τ is offset with respect to the current time t in the domain being modeled. The timeline is divided into discrete “pages,” each representing a successive value of τ. The avatar inserts the ghosts at the insertion horizon. In our current system, the insertion horizon is at τ−t=−30, meaning that ghosts are inserted into a page representing the state of the world 30 minutes ago. At the insertion horizon, each ghost's behavioral parameters (desires and dispositions) are sampled from distributions to explore alternative personalities of the entity it represents.

Each page between the insertion horizon and τ=t (“now,” the page corresponding to the state of the world at the current domain time) records the historical state of the world at the point in the past to which it corresponds. As ghosts move from page to page, they interact with this past state, based on their behavioral parameters. These interactions mean that their fitness depends not just on their own actions, but also on the behaviors of the rest of the population, which is also evolving. Because τ advances faster than real time, eventually τ=t (actual time). At this point, each ghost is evaluated based on its location compared with the actual location of its corresponding real-world entity.

The fittest ghosts have three functions:

1. The personality of the fittest ghost for each entity is reported to the rest of the system as the likely personality of the corresponding entity. This information enables us to characterize individual warriors as unusually cowardly or brave.

2. The fittest ghosts are bred genetically and their offspring are reintroduced at the insertion horizon to continue the fitting process.

3. The fittest ghosts for each entity form the basis for a population of ghosts that are allowed to run past the avatar's present into the future. Each ghost that is allowed to run into the future explores a different possible future of the battle, analogous to how some people plan ahead by mentally simulating different ways that a situation might unfold. Analysis of the behaviors of these different possible futures yields predictions.
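The three functions of the fittest ghosts can be illustrated with a deliberately simplified sketch. It assumes a one-dimensional world and a single scalar "personality" in [−1, +1] that sets how far a ghost drifts per page; real BEE ghosts carry vectors of desires and dispositions and interact through pheromone fields, none of which is modeled here. The names `run_ghost`, `fit_personality`, and `predict`, the averaging crossover, and the mutation width are all hypothetical.

```python
import random


def run_ghost(personality, start, pages):
    """Toy behavior model: the ghost drifts `personality` cells per page."""
    return start + personality * pages


def fit_personality(start, observed_now, pages=30, pop=8,
                    generations=100, seed=0):
    """Evolve scalar personalities so that a ghost inserted `pages` pages
    ago arrives where the real entity is observed at tau = t."""
    rng = random.Random(seed)

    def error(p):
        # Fitness: distance between ghost and observed entity at tau = t.
        return abs(run_ghost(p, start, pages) - observed_now)

    # Sample initial personalities at the insertion horizon.
    ghosts = [rng.uniform(-1.0, 1.0) for _ in range(pop)]
    for _ in range(generations):
        ghosts.sort(key=error)
        parents = ghosts[: pop // 2]          # the fittest survive
        children = []
        while len(parents) + len(children) < pop:
            a, b = rng.sample(parents, 2)     # breed two fit ghosts
            child = (a + b) / 2.0 + rng.gauss(0.0, 0.1)
            children.append(min(1.0, max(-1.0, child)))
        ghosts = parents + children           # offspring are reinserted
    ghosts.sort(key=error)
    return ghosts


def predict(fittest, observed_now, horizon):
    """Run the fittest ghosts past 'now' to sample possible futures."""
    return [run_ghost(p, observed_now, horizon) for p in fittest]


ranked = fit_personality(start=0.0, observed_now=12.0)
reported_personality = ranked[0]              # function 1: report
futures = predict(ranked[:4], 12.0, horizon=30)  # function 3: predict
```

In this sketch each ghost's fitness depends only on its own trajectory; in BEE the fitness also depends on the rest of the evolving population, which this toy model omits.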

A review of this process shows that BEE has three distinct notions of time, all of which may be distinct from real-world time.

1. Domain time t is the current time in the domain being modeled. This time may be the same as real-world time, if BEE is being applied to a real-world situation. In our current experiments, we apply BEE to a battle taking place in a simulator, the OneSAF Test Bed (OTB), and domain time is the time stamp published by OTB. During actual runs, OTB is often paused, so domain time runs slower than real time. When we replay logs from simulation runs, we can speed them up so that domain time runs faster than real time.

2. BEE time τ for a specific page records the domain time corresponding to the state of the world represented on that page, and is offset from the current domain time.

3. Shift time is incremented every time the ghosts move from one page to the next. The relation between shift time and real time depends on the processing resources available.

Pheromone Infrastructure

BEE must operate very rapidly in order to keep pace with an ongoing evolution of a battle or other complex situation. Thus we use simple agents coordinated using pheromone mechanisms. We have described the basic dynamics of our pheromone infrastructure elsewhere [1]. This infrastructure runs on the nodes of a graph-structured environment (in the case of BEE, a rectangular lattice). Each node maintains a scalar value for each flavor of pheromone, and provides three functions:

1. It aggregates deposits from individual agents, fusing information across multiple agents and through time.

2. It evaporates pheromones over time. This dynamic is an innovative alternative to traditional truth maintenance in artificial intelligence. Traditionally, knowledge bases remember everything they are told unless they have a reason to forget something, and expend large amounts of computation in the NP-complete problem of reviewing their holdings to detect inconsistencies that result from changes in the domain being modeled. Ants immediately begin to forget everything they learn, unless it is continually reinforced. Thus inconsistencies automatically remove themselves within a known period.

3. It diffuses pheromones to nearby places, disseminating information for access by nearby agents.
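The three node functions listed above can be sketched for a single pheromone flavor on a rectangular lattice. This is a minimal illustration under assumptions: the class name `PheromoneField`, the evaporation and diffusion fractions, and the four-neighbor diffusion rule are chosen here for clarity and are not taken from the specification or from reference [1].

```python
class PheromoneField:
    """One pheromone flavor on a width x height rectangular lattice."""

    NEIGHBOR_OFFSETS = ((1, 0), (-1, 0), (0, 1), (0, -1))

    def __init__(self, width, height, evaporation=0.1, diffusion=0.2):
        self.width = width
        self.height = height
        self.evaporation = evaporation  # fraction lost per step (assumed)
        self.diffusion = diffusion      # fraction passed to neighbors (assumed)
        self.cells = [[0.0] * width for _ in range(height)]

    def deposit(self, x, y, amount):
        """Aggregate: deposits from any agent add to the node's value."""
        self.cells[y][x] += amount

    def step(self):
        """Evaporate every node, then diffuse a share to its neighbors."""
        new = [[0.0] * self.width for _ in range(self.height)]
        for y in range(self.height):
            for x in range(self.width):
                kept = self.cells[y][x] * (1.0 - self.evaporation)
                neighbors = [
                    (x + dx, y + dy)
                    for dx, dy in self.NEIGHBOR_OFFSETS
                    if 0 <= x + dx < self.width and 0 <= y + dy < self.height
                ]
                if not neighbors:
                    new[y][x] += kept  # isolated node: nothing to diffuse to
                    continue
                share = kept * self.diffusion / len(neighbors)
                new[y][x] += kept * (1.0 - self.diffusion)
                for nx, ny in neighbors:
                    new[ny][nx] += share
        self.cells = new

    def total(self):
        return sum(sum(row) for row in self.cells)


field = PheromoneField(3, 3)
field.deposit(1, 1, 1.0)   # two agents deposit at the same node
field.deposit(1, 1, 1.0)
field.step()
```

Because nothing reinforces the deposit after the first step, repeated calls to `step()` shrink the total geometrically, which is the self-cleaning property that the evaporation function provides.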

The distribution of each pheromone flavor over the environment forms a scalar field that represents some aspect of the state of the world at an instant in time. Each page of the timeline discussed in the previous section is a complete pheromone field for the world at the BEE time τ represented by that page. The behavior of the pheromones on each page depends on whether the page represents the past or the future.

In pages representing the future (τ>t), the usual pheromone mechanisms apply. Ghosts deposit pheromone each time they move to a new page, and pheromones evaporate and propagate from one page to the next.

In pages representing the domain past (τ≤t), one has an observed state of the real world. This has two consequences for pheromone management. First, we can generate the pheromone fields directly from the observed locations of individual entities, so there is no need for the ghosts to make deposits. Second, we can adjust the pheromone intensities based on the changed locations of entities from page to page, so we do not need to evaporate or propagate the pheromones. Both of these simplifications reflect the fact that in our current system, we have complete knowledge of the past. When we introduce noise and uncertainty, we will probably need to introduce dynamic pheromones in the past as well as the future.

Execution of the pheromone infrastructure proceeds on two time scales, running in separate threads.

The first thread updates the book of pages each time the domain time advances past the next page boundary. At each step:

1. The former “now+1” page is replaced with a new current page, whose pheromones correspond to the locations and strengths of observed units;

2. An empty page is added at the prediction horizon; and

3. The oldest page is discarded, since it has passed the insertion horizon.
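The page-update steps of the first thread can be sketched with a double-ended queue. The `Book` class, its dictionary page layout, and the integer τ values are assumptions made for illustration; threading and the pheromone fields themselves are omitted.

```python
from collections import deque


class Book:
    """Timeline of pages from the insertion horizon (left end)
    to the prediction horizon (right end)."""

    def __init__(self, past_pages, future_pages):
        self.now_index = past_pages  # position of the tau = t page
        self.pages = deque(
            {"tau": k - past_pages, "field": {}}
            for k in range(past_pages + future_pages + 1)
        )

    def advance(self, observed_field):
        """Called when domain time passes the next page boundary."""
        nxt = self.now_index + 1
        # 1. Replace the former "now+1" page with the observed state.
        self.pages[nxt] = {"tau": self.pages[nxt]["tau"],
                           "field": dict(observed_field)}
        # 2. Add an empty page at the prediction horizon.
        self.pages.append({"tau": self.pages[-1]["tau"] + 1, "field": {}})
        # 3. Discard the oldest page, which has passed the insertion horizon.
        self.pages.popleft()

    def now(self):
        return self.pages[self.now_index]


book = Book(past_pages=2, future_pages=2)
book.advance({"RedAlive": 1.0})
```

After `advance`, the page at `now_index` is the newly observed one and the book keeps a constant number of pages, so the ghost-moving thread always sees a timeline of fixed length.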

The second thread moves the ghosts from one page to the next, as fast as the processor allows. At each step:

1. Ghosts reaching the τ=t page are evaluated for fitness and removed or evolved;

2. New ghosts from the avatars and from the evolutionary process are inserted at the insertion horizon;

3. A population of ghosts based on the fittest ghosts is inserted at τ=t to run into the future;

4. Ghosts that have moved beyond the prediction horizon are removed;

5. All ghosts plan their next actions based on the pheromone field in the pages they currently occupy;

6. The system computes the next state of each page, including executing the actions elected by the ghosts, and (in future pages) evaporating pheromones and recording new deposits from the recently arrived ghosts.

Ghost movement based on pheromone gradients is a very simple process, so this system can support realistic agent populations without excessive computer load. In our current system, each avatar generates eight ghosts per shift. Since there are about 50 entities in the battlespace (about 20 units each of Red and Blue and about 5 of Green), we must support about 400 ghosts per page, or about 24000 over the entire book.

How fast a processor do we need? Let p be the real-time duration of a page in seconds. If each page represents 60 seconds of domain time, and we are replaying a simulation at 2× domain time, p=30. Let n be the number of pages between the insertion horizon and τ=t. In our current system, n=30. Then a shift rate of n/p shifts per second will permit ghosts to run from the insertion horizon to the current time at least once before a new page is generated. Empirically, we have found this level a reasonable lower bound for acceptable performance, and easily achievable on stock WinTel platforms.
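The sizing arithmetic above can be checked with a short calculation. The function name `min_shift_rate` is hypothetical, and the page count of 60 is inferred here from the stated totals (24000 ghosts at 400 per page) rather than given in the text.

```python
def min_shift_rate(pages_past, page_domain_seconds, replay_speed):
    """Shifts per second needed for ghosts to cross all past pages
    once before the next page is generated (n / p)."""
    p = page_domain_seconds / replay_speed  # real seconds per page
    return pages_past / p


ghosts_per_page = 8 * 50                  # eight ghosts per avatar, ~50 entities
implied_pages = 24000 // ghosts_per_page  # inferred, not stated in the text
rate = min_shift_rate(pages_past=30, page_domain_seconds=60, replay_speed=2)
```

With the values from the text, p is 30 seconds and the required rate works out to one shift per second; replaying at a higher speed shortens p and raises the required rate proportionally.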

Information Sources

The flexibility of the BEE's pheromone infrastructure permits the integration of numerous information sources as input to our characterizations of entity personalities and predictions of their future behavior. Our current system draws on three sources of information, but others can readily be added.

Real-world observations. Observations from the real world are encoded into the pheromone field each increment of BEE time, as a new “current page” is generated. Table 1 identifies the entities that generate each flavor of pheromone.

Statistical estimates of threat regions. An independent process (known as SAD (Statistical Anomaly Detection), developed by Rafael Alonso, Hua Li, and John Asmuth at Sarnoff Corporation) uses statistical techniques to estimate the level of threat to each force (Red or Blue), based on the topology of the battlefield and the known disposition of forces. For example, a broad open area with no cover is particularly threatening, especially if the opposite force occupies its margins. The results of this process are posted to the pheromone pages as “RedThreat” pheromone (representing a threat to Red) and “BlueThreat” pheromone (representing a threat to Blue).

AI-based plan recognition. BEE is motivated by the recognition that prediction requires not only analysis of an entity's intentions, but also its internal emotional state and the dynamics it experiences externally in interacting with the environment. While plan recognition is not sufficient for effective prediction, it is a valuable input. In the current system, a Bayes net is dynamically configured based on heuristics to identify the likely goals that each entity may hold. This process is known as KIP (Knowledge-based Intention Projection). The destinations of these goals function as “virtual pheromones.” As described below, ghosts include their distance to such points in their action decisions, achieving the result of gradient following without the computational expense of maintaining a pheromone field.

Experimental Results

BEE has been tested in a series of experiments in which human wargamers make decisions that are played out in a real-time battlefield simulator. The commander for each side (Red and Blue) has at his disposal a team of pucksters, human operators who set waypoints for individual units in the simulator. Each puckster is responsible for four to six units. The simulator moves the units, determines firing actions, and resolves the outcome of conflicts.

Fitting Dispositions

To test the system's ability to fit personalities based on behavior, one Red puckster responsible for four units was designated the “emotional” puckster. His instructions were to select two of his units to be cowardly (“chickens”) and two to be irritable (“Rambos”). He did not disclose this assignment during the run. His instructions were to move each unit according to the commander's orders until the unit encountered circumstances that would trigger the emotion associated with the unit's disposition. Then he would manipulate chickens as though they were fearful (typically avoiding combat and moving away from Blue), and would move Rambos into combat as quickly as possible.

It has been found that the difference between the two disposition values (Cowardice-Irritability) of the fittest ghosts is a better indicator of the emotional state of the corresponding entity than either value by itself.

FIG. 4 shows the delta disposition for each of the eight fittest ghosts at each time step, plotted against the time step in seconds, for a unit played as a Chicken in an actual run. The values clearly trend negative.

FIG. 5 shows a similar plot for a Rambo. Units played with an aggressive personality tend to die very soon, and often do not give their ghosts enough time to evolve a clear picture of their personality, but in this case the positive Delta Disposition is clearly evident before the unit's demise.

To distill such a series of points into a characterization of a unit's personality, we maintain an 800-second exponentially weighted moving average of the Delta Disposition, and declare the unit to be a Chicken or Rambo if this value passes a negative or positive threshold, respectively. Currently, this threshold is set at 0.25 in magnitude. Other filters may be used. For example, a rapid rate of increase enhances the likelihood of calling a Rambo; units that seek to avoid detection and avoid combat are more readily called Chicken.
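The thresholding rule can be sketched as follows. This is a minimal illustration rather than the implementation used in the experiments: the function name `classify_unit`, the constant sampling interval, the neutral starting value, and the way the smoothing factor is derived from the 800-second window are all assumptions not stated in the text. Only the window length, the threshold of 0.25, and the sign convention (negative for Chicken, positive for Rambo) come from the description above.

```python
import math

WINDOW_SECONDS = 800.0  # averaging window given in the text
THRESHOLD = 0.25        # threshold magnitude given in the text


def classify_unit(deltas, dt_seconds, window=WINDOW_SECONDS, threshold=THRESHOLD):
    """Return 'Chicken', 'Rambo', or None for a series of Delta Disposition samples.

    deltas: Delta Disposition values, one per time step.
    dt_seconds: spacing between samples (assumed constant).
    """
    # Assumption: treat the window as the time constant of the moving average.
    alpha = 1.0 - math.exp(-dt_seconds / window)
    average = 0.0  # assumption: the average starts at a neutral value
    for delta in deltas:
        average += alpha * (delta - average)
        if average < -threshold:
            return "Chicken"
        if average > threshold:
            return "Rambo"
    return None  # neither threshold was passed
```

With this choice of smoothing factor, a short burst of extreme values does not trigger a call; the average has to stay on one side long enough to cross the threshold.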

Table 4 shows the percentages of emotional units detected in a recent series of experiments. A Rambo is never called a Chicken, and examination of the logs for the one case where a Chicken is called a Rambo shows that in fact the unit was being played aggressively, rushing toward oncoming Blue forces. Because the brave die young, we almost never detect units played intentionally as Rambos.

TABLE 4 EXPERIMENTAL RESULTS ON FITTING DISPOSITIONS (16 runs)

            Called Correctly    Called Incorrectly    Not Called
Chickens          68%                   5%               27%
Rambos             5%                   0%               95%

In addition to these results on units intentionally played as emotional, there are a number of cases where other units were detected as cowardly or brave. Analysis of the behavior of these units shows that these characterizations were appropriate: units that flee in the face of enemy forces or weapons fire are detected as Chickens, while those that stand their ground or rush the adversary are denominated as Rambos.

Integrated Predictions

Each ghost that runs into the future generates a possible future path that its unit might follow. The set of such paths for all ghosts embodies a number of distinct predictions, including the most or least likely future, the future that poses the greatest or least risk to the opposite side, the future that poses the greatest or least risk to one's own side, and so forth. In the experiments reported here, the future whose ghost receives the most guidance from pheromones in the environment was selected at each step along the way. In this sense, it is the most likely future.

Assessing the accuracy of these predictions requires a set of metrics, and a baseline against which they can be compared.

Metrics for Predictions

In one embodiment, two sets of metrics may be used. One set evaluates predictions in terms of their individual steps. The other examines several characteristics of an entire prediction.

The step-wise evaluations are based on the structure summarized schematically in FIG. 6. Each row in the matrix is a successive prediction. Each column describes a real-world time step. A given cell records the distance between where the row's prediction indicated the unit would be at the column's time, and where it actually was.

The figure shows how these cells can be averaged meaningfully to yield three different measures: the prospective accuracy of a single prediction issued at a point in time, the retrospective accuracy of all predictions concerning a given point in time, or the offset accuracy showing how predictions vary as a function of look-ahead depth.
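The three averages can be illustrated with a small sketch. It assumes, hypothetically, that the matrix is stored as a list of rows in which cells not covered by a prediction hold `None`, and that the prediction in row i is issued at time step i, so that a fixed look-ahead depth corresponds to a diagonal. The function names are illustrative only. Because each cell holds a distance, lower averages mean better accuracy.

```python
from statistics import mean


def _mean_or_none(values):
    """Average the cells that hold a distance; None if no cell applies."""
    values = [v for v in values if v is not None]
    return mean(values) if values else None


def prospective_error(matrix, row):
    """Average error of the single prediction stored in `row`."""
    return _mean_or_none(matrix[row])


def retrospective_error(matrix, col):
    """Average error of every prediction that covers time step `col`."""
    return _mean_or_none(r[col] for r in matrix)


def offset_error(matrix, lookahead):
    """Average error at a fixed look-ahead depth (a diagonal of the matrix).

    Assumption: the prediction in row i is issued at time step i.
    """
    return _mean_or_none(
        row[i + lookahead]
        for i, row in enumerate(matrix)
        if 0 <= i + lookahead < len(row)
    )


# Three predictions, each looking up to two steps ahead, over four time steps.
errors = [
    [None, 1.0, 2.0, None],
    [None, None, 1.5, 3.0],
    [None, None, None, 0.5],
]
```

Here `prospective_error(errors, 0)` averages the first row, `retrospective_error(errors, 2)` averages the third column, and `offset_error(errors, 1)` averages the cells one step ahead of each prediction's issue time.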

The second set of metrics is based on characteristics of an entire prediction. FIG. 7 summarizes three such characteristics of a path (whether real or predicted): the overall angle θ it subtends, the straight-line radius ρ from start to end, and the actual length λ integrated along the path. A fourth characteristic of interest is the number of time intervals τ during which the unit was moving. Each of these four values provides a basis of comparison between a prediction and a unit's actual movement (or between any two paths).
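As a rough sketch, the four characteristics can be computed from a path stored as a list of (x, y) positions, one per time step. The representation and the function name `path_characteristics` are assumptions. In particular, FIG. 7 is not reproduced here, so the angle is taken, as an assumption, to be the bearing of the straight line from the start of the path to its end; the straight-line radius is the range used later in RScore.

```python
import math


def path_characteristics(points):
    """Return (theta, rho, lam, tau) for a path given as [(x, y), ...]."""
    (x0, y0), (x1, y1) = points[0], points[-1]
    # Straight-line distance from start to end.
    rho = math.hypot(x1 - x0, y1 - y0)
    # Assumption: theta is the start-to-end bearing in degrees, in [0, 360).
    theta = math.degrees(math.atan2(y1 - y0, x1 - x0)) % 360.0
    # Length integrated along the path: sum of the segment lengths.
    segments = [math.dist(a, b) for a, b in zip(points, points[1:])]
    lam = sum(segments)
    # Number of time intervals during which the unit actually moved.
    tau = sum(1 for s in segments if s > 0)
    return theta, rho, lam, tau
```

For a path such as `[(0, 0), (3, 0), (3, 0), (3, 4)]`, the unit pauses for one interval, so the path length exceeds the straight-line distance and only two of the three intervals count as moving.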

AScore (Angle Score).—Let θ_(p) be the angle associated with the prediction, and θ_(a) the angle associated with the unit's actual path over the period covered by the prediction. Let Δθ=|θ_(p)−θ_(a)|. The angle score is (with angles expressed in degrees) AScore=1−Min(Δθ, 360−Δθ)/180.

If Δθ=0, AScore=1. If Δθ=180, AScore=0. The average of a set of random predictions will produce a score approaching 0.5.
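The angle score translates directly into code. In this sketch the function name `a_score` is illustrative, and the modulo operation is an added safeguard (an assumption, not part of the stated formula) so that inputs differing by more than 360 degrees are still handled.

```python
def a_score(theta_p, theta_a):
    """Angle score for predicted and actual angles given in degrees."""
    # Assumption: wrap the difference into [0, 360) before applying the formula.
    delta = abs(theta_p - theta_a) % 360.0
    return 1.0 - min(delta, 360.0 - delta) / 180.0


# Averaging over evenly spaced actual angles mimics a set of random predictions.
average = sum(a_score(0.0, float(a)) for a in range(360)) / 360
```

The `Min` term measures the difference the short way around the circle, so a prediction of 350 degrees against an actual angle of 10 degrees is treated as 20 degrees off, not 340. The `average` computed above comes out at 0.5 up to rounding, in line with the statement about random predictions.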

RScore (Range Score).—Let ρ_(p) be the straight-line distance from the current position to the end of the prediction, and ρ_(a) the straight-line distance for the actual path. The range score is: RScore=1.0−|ρ_(p)−ρ_(a)|/Max(ρ_(p), ρ_(a)).

If the prediction is perfect, ρ_(p)=ρ_(a), and RScore=1. If the ranges are different, RScore gives the percentage that the shorter range is of the longer one. Special logic returns an RScore of 0 if just one of the ranges is 0, and 1 if both are 0.

LScore (Length Score).—Let λ_(p) be the sum of path segment distances for the prediction, and λ_(a) the sum of path segment distances for the actual path. The length score is: LScore=1.0−|λ_(p)−λ_(a)|/Max(λ_(p), λ_(a)).

If the prediction is perfect, λ_(p)=λ_(a), and LScore=1. If both lengths are non-zero, LScore indicates what percentage the shorter path length is of the longer path length. Special logic returns an LScore of 0 if just one of the lengths is 0, and 1 if both are 0.

TScore (Time Score).—Let τ_(p) be the number of minutes that the unit is predicted to move, and τ_(a) the number of minutes that it actually moves. The time score is: TScore=1.0−|τ_(p)−τ_(a)|/Max(τ_(p), τ_(a)).

If the prediction is perfect, τ_(p)=τ_(a), and TScore=1. If both times are non-zero, TScore indicates what percentage the shorter time is of the longer time. Special logic returns a TScore of 0 if just one of the times is 0, and 1 if both are 0.
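RScore, LScore, and TScore share one formula, so a single helper can serve all three. The sketch below assumes non-negative inputs; the names `ratio_score`, `r_score`, `l_score`, and `t_score` are illustrative and do not come from the text.

```python
def ratio_score(predicted, actual):
    """Shared form of RScore, LScore, and TScore for non-negative inputs."""
    if predicted == 0 and actual == 0:
        return 1.0  # special case: both values are zero
    if predicted == 0 or actual == 0:
        return 0.0  # special case: exactly one value is zero
    return 1.0 - abs(predicted - actual) / max(predicted, actual)


# The three scores differ only in what is measured (range, length, time).
r_score = l_score = t_score = ratio_score
```

For two non-zero values the result equals the smaller value divided by the larger, which is the "percentage of the longer one" reading given above. The general formula already yields 0 when exactly one value is zero; only the case where both are zero needs separate handling, because the formula would otherwise divide zero by zero.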

Baseline

As a baseline for comparison, a random-walk predictor can be implemented. This process starts at a unit's current location, then takes 30 random steps. A random step consists of picking a random integer uniformly distributed between 0 and 120 indicating the next cell to move to in an 11-by-11 grid with the current position at the center. (The grid was size 11 because the BEE movement model allows the ghosts to move from 0 to 5 cells in the x and y directions at each step.) The random number r is translated into x and y steps, Δx, Δy, using the equations Δx=⌊r/11⌋−5, Δy=(r mod 11)−5, where ⌊r/11⌋ denotes integer division.

To compile a baseline, the random prediction is generated 100 times, and each of these runs is used to generate one of the metrics discussed above. The baseline reported is the average of these 100 instances.
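The random-walk baseline can be sketched as below. The step count (30), the range of r (0 to 120), the mapping to Δx and Δy, and the 100 repetitions come from the description above. The function names, the (x, y) tuple representation, and the `metric` callable, a hypothetical function taking a predicted path and an actual path and returning a score, are assumptions made for illustration.

```python
import random


def random_step(rng):
    """Pick one cell of the 11-by-11 grid centered on the current position."""
    r = rng.randint(0, 120)         # uniform integer in 0..120 inclusive
    return r // 11 - 5, r % 11 - 5  # each offset lies in -5..5


def random_walk(start, steps=30, rng=None):
    """Return the positions visited by a random walk of `steps` steps."""
    if rng is None:
        rng = random.Random()
    x, y = start
    path = [(x, y)]
    for _ in range(steps):
        dx, dy = random_step(rng)
        x, y = x + dx, y + dy
        path.append((x, y))
    return path


def baseline(metric, start, actual_path, runs=100, rng=None):
    """Average a metric over `runs` independent random-walk predictions."""
    if rng is None:
        rng = random.Random()
    scores = [metric(random_walk(start, rng=rng), actual_path) for _ in range(runs)]
    return sum(scores) / runs
```

Passing a seeded generator such as `random.Random(0)` makes a baseline run repeatable, which helps when comparing it against predictions for the same unit.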

EXAMPLES

FIG. 8 illustrates the three stepwise metrics for a single unit in a single run. In the case of this unit, BEE was able to formulate good predictions, which are superior to the baseline in all three metrics. It is particularly encouraging that the horizon error increases so gradually. In a complex nonlinear system, trajectories may diverge at some point, making prediction physically impossible. One would expect to see a discontinuity in the horizon error if the system were reaching this limit. The gentle increase of the horizon error suggests that we are not near this position.

FIG. 9 illustrates the four component metrics for the same unit and the same run. In general, these metrics support the conclusion that these predictions are superior to the baseline, and make clear which characteristics of the prediction are most reliable.

The BEE architecture lends itself to extension in several areas. The various inputs being integrated by the BEE are only an example of the kinds of information that can be handled. The basic principle of using a dynamical simulation to integrate a wide range of influences can be extended to other inputs as well, requiring much less additional engineering than other more traditional ways of reasoning about how different knowledge sources come together in impacting an agent's behavior.

The initial limited repertoire of emotions is a small subset of those that have been distinguished by psychologists, and that might be useful for understanding and projecting behavior. The set of emotions and supporting dispositions that BEE can detect can be extended.

The mapping between an agent's psychological (cognitive and emotional) state and its outward behavior is not one-to-one. Several different internal states might be consistent with a given observed behavior under one set of environmental conditions, but might yield distinct behaviors under other conditions. If the environment in the recent past is one that confounds such distinct internal states, one will be unable to distinguish them, and if the environment shifts to a condition in which they yield different behaviors, any predictions will suffer. One can probe the real world, perturbing it in ways that would stimulate distinct behaviors from entities whose psychological state is otherwise indistinguishable. BEE's faster-than-real-time simulation can allow the user to identify appropriate probing actions, greatly increasing the effectiveness of intelligence efforts.

While BEE has been developed in the context of adversarial reasoning in urban warfare, it is applicable in a much wider range of applications, including computer games, business strategy, and sensor fusion.

Thus, it should be understood that the embodiments and examples described herein have been chosen and described in order to best illustrate the principles of the invention and its practical applications to thereby enable one of ordinary skill in the art to best utilize the invention in various embodiments and with various modifications as are suited for particular uses contemplated. Even though specific embodiments of this invention have been described, they are not to be taken as exhaustive. There are several variations that will be apparent to those skilled in the art.

REFERENCES

-   [1] S. Brueckner. Return from the Ant: Synthetic Ecosystem for Manufacturing Control. Dr. rer. nat. Thesis at Humboldt University Berlin, Department of Computer Science, 2000. Available at http://dochostrz.hu-berlin.de/dissertationen/brueckner-sven-2000-06-21/P-DF/Brueckner.pdf.
-   [2] S. Carberry. Techniques for Plan Recognition. User Modeling and User-Adapted Interaction, 11(1-2):31-48, 2001. Available at http://www.cis.udel.edu/~carberry/Papers/UMUAI-PlanRec.ps.
-   [3] J. Ferber and J.-P. Muller. Influences and Reactions: a Model of Situated Multiagent Systems. In Proceedings of Second International Conference on Multi-Agent Systems (ICMAS-96), pages 72-79, 1996.
-   [4] A. Haddadi and K. Sundermeyer. Belief-Desire-Intention Agent Architectures. In G. M. P. O'Hare and N. R. Jennings, Editors, Foundations of Distributed Artificial Intelligence, pages 169-185. John Wiley, New York, N.Y., 1996.
-   [5] A. Ilachinski. Artificial War: Multiagent-based Simulation of Combat. Singapore, World Scientific, 2004.
-   [6] H. Kantz and T. Schreiber. Nonlinear Time Series Analysis. Cambridge, UK, Cambridge University Press, 1997.
-   [7] M. K. Lauren and R. T. Stephen. Map-Aware Non-uniform Automata (MANA)—A New Zealand Approach to Scenario Modelling. Journal of Battlefield Technology, 5(1 (March)):27ff, 2002. Available at http://www.argospress.com/jbt/Volume5/5-1-4.htm.
-   [8] F. Michel. Formalisme, methodologie et outils pour la modelisation et la simulation de systemes multi-agents. Doctorat Thesis at Universite des Sciences et Techniques du Languedoc, Department of Informatique, 2004. Available at http://www.lirmm.fr/~fmichel/these/index.html.
-   [9] A. Ortony, G. L. Clore, and A. Collins. The cognitive structure of emotions. Cambridge, UK, Cambridge University Press, 1988.
-   [10] H. V. D. Parunak, R. Bisson, S. Brueckner, R. Matthews, and J. Sauter. Representing Dispositions and Emotions in Simulated Combat. In Proceedings of Workshop on Defense Applications of Multi-Agent Systems (DAMAS05, at AAMAS05), pages (forthcoming), 2005. Available at http://www.altarum.net/~vparunak/DAMAS05DETT.pdf.
-   [11] H. V. D. Parunak and S. Brueckner. Ant-Like Missionaries and Cannibals: Synthetic Pheromones for Distributed Motion Control. In Proceedings of Fourth International Conference on Autonomous Agents (Agents 2000), pages 467-474, 2000. Available at http://www.altarum.net/~vparunak/MissCann.pdf.
-   [12] H. V. D. Parunak, S. Brueckner, M. Fleischer, and J. Odell. A Design Taxonomy of Multi-Agent Interactions. In Proceedings of Agent-Oriented Software Engineering IV, pages 123-137, Springer, 2003. Available at www.altarum.net/~vparunak/cox.pdf.
-   [13] H. V. D. Parunak, S. Brueckner, and J. Sauter. Digital Pheromones for Coordination of Unmanned Vehicles. In Proceedings of Workshop on Environments for Multi-Agent Systems (E4MAS 2004), pages 246-263, Springer, 2004. Available at http://www.altarum.net/~vparunak/AAMAS04_UAVCoordination.pdf.
-   [14] H. V. D. Parunak, S. A. Brueckner, and J. Sauter. Digital Pheromone Mechanisms for Coordination of Unmanned Vehicles. In Proceedings of First International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS 2002), pages 449-450, 2002. Available at www.altarum.net/~vparunak/AAMAS02ADAPTIV.pdf.
-   [15] A. S. Rao and M. P. Georgeff. Modeling Rational Agents within a BDI Architecture. In Proceedings of International Conference on Principles of Knowledge Representation and Reasoning (KR-91), pages 473-484, Morgan Kaufmann, 1991.
-   [16] J. A. Sauter, R. Matthews, H. V. D. Parunak, and S. Brueckner. Evolving Adaptive Pheromone Path Planning Mechanisms. In Proceedings of Autonomous Agents and Multi-Agent Systems (AAMAS02), pages 434-440, 2002. Available at www.altarum.net/~vparunak/AAMAS02Evolution.pdf.

1. A method of predicting the behavior of an agent in an environment, comprising the steps of: executing a computer simulation of an environment including a plurality of software agents; estimating the internal state of at least one of the agents based upon its behavior in the simulation, including its movement within the environment; and predicting the likely future behavior of the agent based upon the estimate of its internal state.
2. The method of claim 1, wherein the agent's internal state is estimated by examining changes in the agent's observed behavior.
3. The method of claim 1, wherein the agent's internal state is estimated in conjunction with a model of the environment.
4. The method of claim 1, wherein the prediction of the agent's future behavior is based in part on the agent's interaction with the environment.
5. The method of claim 1, wherein the agents represent human beings.
6. The method of claim 1, wherein the simulated environment comprises digital pheromones.
7. The method of claim 6, wherein the digital pheromones are scalar variables that agents can sense and which they deposit at their current location in the environment.
8. The method of claim 7, wherein the agents respond to the local concentrations of the digital pheromones tropistically through climbing or descending local gradients.
9. The method of claim 6, wherein the pheromones run on the nodes of a graph-structured environment.
10. The method of claim 6, wherein the graph-structured environment is a rectangular lattice.
11. The method of claim 6, wherein each agent is capable of aggregating pheromone deposits from individual agents, thereby fusing information across multiple agents over time.
12. The method of claim 6, wherein each agent is capable of evaporating pheromones over time to remove inconsistencies that result from changes in the simulation.
13. The method of claim 6, wherein each agent is capable of diffusing pheromones to nearby places, thereby disseminating information for access by nearby agents.
14. The method of claim 6, wherein the movements of the agents change their deposit patterns.
15. The method of claim 6, wherein the simulation integrates knowledge of threat regions, a cognitive analysis of the agent's beliefs, desires, and intentions, a model of the agent's emotional disposition and state, and the dynamics of interactions with the environment.
16. The method of claim 1, wherein the simulation involves urban warfare.
17. The method of claim 1, wherein the simulation involves a computer game.
18. The method of claim 1, wherein the simulation involves a business strategy.
19. The method of claim 1, wherein the simulation involves a sensor fusion.
20. A system for predicting the behavior of an agent in an environment, comprising: a processor or microprocessor coupled to a memory, wherein the processor or microprocessor is programmed to evaluate search results by: executing a computer simulation of an environment including a plurality of software agents; estimating the internal state of at least one of the agents based upon its behavior in the simulation, including its movement within the environment; and predicting the likely future behavior of the agent based upon the estimate of its internal state.